Adaptation of Prosody in Speech Synthesis by Changing Command Values of the Generation Process Model of Fundamental Frequency

نویسندگان

Keikichi Hirose

Keiko Ochi

Ryusuke Mihara

Hiroya Hashimoto

Daisuke Saito

Nobuaki Minematsu

چکیده

A method was developed to adapt prosody to a new speaker/style in speech synthesis. It is based on predicting differences between target and original speakers/styles and applying them to the original one. Differences in fundamental frequency (F0) contours are represented in the framework of the generation process model; differences in the command magnitudes/amplitudes. While the original one requires a certain amount of training corpus, while corpus for training command differences can be small. Furthermore, in the case of style adaptation, it is not necessarily the corpus being uttered by the same speaker of the original style. Speech synthesis was conducted using HMM-based speech synthesis system, where prosody was controlled by the method. Listening experiments on synthetic speech with style adaptation and voice conversion both showed the validity of the method.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Using FO Contour Generation Process Model for Improved and Flexible Control of Prosodie Features in HMM-based Speech Synthesis

Generation process model of fundamental frequency contours known as Fujisaki's model is ideal to represent global features of prosody. It is a command response model, where the commands have clear relations with linguistic and para/non linguistic information included in the utterance. Therefore, by controlling fundamental frequency contours in the framework of the generation process model, a mo...

متن کامل

Fundamental Frequency Contour Reshaping in HMM-based Speech Synthesis and Realization of Prosodic Focus Using Generation Process Model

Frame-by-frame representation is not appropriate for prosodic features, which are tightly related to speech units spreading a wide time span, such as words, phrases and so on. This causes an inherit problem in fundamental frequency (F0) contour generation by HMM-based speech synthesis. Our formerlydeveloped method, which modify generated F0 contours in the framework of the generation process mo...

متن کامل

Analytical Study on Fundamental Frequency Contours of Thai Expressive Speech Using Fujisaki’s Model

Problem statement: In spontaneous speech communication, prosody is an important factor that must be taken into account, since the prosody effects on not only the naturalness but also the intelligibility of speech. Focusing on synthesis of Thai expressive speech, a number of systems has been developed for years. However, the expressive speech with various speaking styles has not been accomplishe...

متن کامل

Study on Unit-Selection and Statistical Parametric Speech Synthesis Techniques

One of the interesting topics on multimedia domain is concerned with empowering computer in order to speech production. Speech synthesis is granting human abilities to the computer for speech production. Data-based approach and process-based approach are the two main approaches on speech synthesis. Each approach has its varied challenges. Unit-selection speech synthesis and statistical parametr...

متن کامل

Corpus-based synthesis of fundamental frequency contours with various speaking styles from text using F0 contour generation process model

A corpus-based method of generating fundamental frequency (F0) contours of various speaking styles from text was developed. Instead of directly predicting F0 values, the method predicts command values of the F0 contour generation process model. Because of the model constraint, the resulting F0 contour keeps certain naturalness even when the prediction is done incorrectly. The method includes a ...

متن کامل

ذخیره در منابع من

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره شماره

صفحات -

تاریخ انتشار 2011

Adaptation of Prosody in Speech Synthesis by Changing Command Values of the Generation Process Model of Fundamental Frequency

نویسندگان

چکیده

منابع مشابه

Using FO Contour Generation Process Model for Improved and Flexible Control of Prosodie Features in HMM-based Speech Synthesis

Fundamental Frequency Contour Reshaping in HMM-based Speech Synthesis and Realization of Prosodic Focus Using Generation Process Model

Analytical Study on Fundamental Frequency Contours of Thai Expressive Speech Using Fujisaki’s Model

Study on Unit-Selection and Statistical Parametric Speech Synthesis Techniques

Corpus-based synthesis of fundamental frequency contours with various speaking styles from text using F0 contour generation process model

عنوان ژورنال:

اشتراک گذاری